Combining lexical, syntactic and prosodic cues for improved online dialog act tagging

نویسندگان

  • Vivek Kumar Rangarajan Sridhar
  • Srinivas Bangalore
  • Shrikanth S. Narayanan
چکیده

Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic– prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use coarse representation of the prosodic contour through summative statistics of the prosodic contour. The proposed scheme for exploiting prosody results in an absolute improvement of 8.7% over the use of most other widely used representations of acoustic correlates of prosody. The proposed scheme is discriminative and exploits context in the form of lexical, syntactic and prosodic cues from preceding discourse segments. Such a decoding scheme facilitates online DA tagging and offers robustness in the decoding process, unlike greedy decoding schemes that can potentially propagate errors. Our approach is different from traditional DA systems that use the entire conversation for offline dialog act decoding with the aid of a discourse model. In contrast, we use only static features and approximate the previous dialog act tags in terms of lexical, syntactic and prosodic information extracted from previous utterances. Experiments on the Switchboard-DAMSL corpus, using only lexical, syntactic and prosodic cues from three previous utterances, yield a DA tagging accuracy of 72% compared to the best case scenario with accurate knowledge of previous DA tags (oracle), which results in 74% accuracy. ! 2009 Elsevier Ltd. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploiting prosodic features for dialog act tagging in a discriminative modeling framework

Cue-based automatic dialog act tagging uses lexical, syntactic and prosodic knowledge in the identification of dialog acts. In this paper, we propose a discriminative framework for automatic dialog act tagging using maximum entropy modeling. We propose two schemes for integrating prosody in our modeling framework: (i) Syntaxbased categorical prosody prediction from an automatic prosody labeler,...

متن کامل

Lexical, Prosodic, And Syntactic Cues For Dialog Acts

The structure of a discourse is reflected in many aspects of its linguistic realization, including its lexical, prosodic, syntactic, and semantic nature. Multiparty dialog contains a particular kind of discourse structure, the dialog act (DA). Like other types of structure, the dialog act sequence of a conversation is also reflected in its lexical, prosodic, and syntactic realization. This pape...

متن کامل

Production of English Lexical Stress by Persian EFL Learners

This study examines the phonetic properties of lexical stress in English produced by Persian speakers learning English as a foreign language. The four most reliable phonetic correlates of English lexical stress, namely fundamental frequency, duration, intensity, and vowel quality were measured across Persian speakers’ production of the stressed and unstressed syllables of five English disyllabi...

متن کامل

Dialog Act Modeling for Automatic Tagging and Recognition of Conversational Speech

We describe a statistical approach for modeling dialog acts in conversational speech, i.e., speechact-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialog acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialog act sequence. The dialog model is based on treating the d...

متن کامل

Reliability of Lexical and Prosodic Cues in Two Real-life Spoken Dialog Corpora

The present research focuses on analyzing and detecting emotions in speech as revealed by task-dependent spoken dialogs corpora. Previously, we have conducted several experiments on a real-life corpus in order to develop a reliable annotation method and to detect lexical and prosodic cues correlated to the main emotion class. In this paper we evaluate both the robustness of the annotation schem...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computer Speech & Language

دوره 23  شماره 

صفحات  -

تاریخ انتشار 2009